COMPREHENSIVE SPOKEN LANGUAGE LEARNING SYSTEM

Technical Field 

This invention relates generally to educational systems and, more particularly, to
computer-assisted spoken language instruction.

Background Art 

Many applications have been developed to teach spoken language skills using a
computer such as a PC. Some applications were very ambitious and attempted to
replace a teacher in a classroom or a private lesson, whereas others were more
modest and aimed only to provide additional training and practice that could not
otherwise be achieved without the presence of a native speaker as a teacher. For example, a
native English speaker is a rare and expensive resource in most places in the world that
are not themselves populated with native English speakers. Therefore, there is a
continuous effort to make more efficient use of computerized systems to
support foreign language teaching, and especially the teaching of spoken language
skills.

Many language instruction inventions can also be found in the field, but most of
them still lack the proper definition and set of features that would make them a
popular means of acquiring spoken language skills.

It is known to provide a system that includes identification of pronunciation
errors, where the criteria are more suitable to a phonetician, whereas the requirements
an average teacher has for a student of a foreign language (such as English) are typically
much lower.

Teachers, in general, encourage students who want to acquire spoken language
skills to speak first. Immediate correction of multiple errors can discourage the student,
rather than encourage him/her in his/her study.

To provide improved instruction, two application engines can be defined:
Pronunciation and Communication. Both engines can be based on the same Speech
Recognition engine optimized to identify pronunciation errors. The difference
between them is typically the set of rules used to identify pronunciation
errors and the criteria that define which errors are reported to the user and which are
ignored and skipped.

Summary 

The present invention supports interactive dialogue in which a spoken user input 
is recorded into a computerized device and then analyzed according to phonetic criteria. 
A computerized method of teaching spoken language skills includes receiving multiple 
user utterances into a computer system, receiving criteria for pronunciation errors, 
analyzing the user utterances to detect pronunciation errors according to basic sound units 
and pronunciation error criteria, and providing feedback to the user in accordance with
the analysis. 

In the communication mode of the application software, the system is generally more
tolerant of pronunciation errors and can provide feedback, for example, only on those
errors that cause the user to be misunderstood. Any other pronunciation error may be
skipped. The described system can be generalized by adding two filters to
the "ultimate" speech recognition engine that targets pronunciation errors, in
order to comply with the different application requirements.

In a pronunciation mode, all pronunciation errors are targets of the Speech
Recognition engine, whereas in a communication mode, some of the errors are
allowed (i.e., skipped) by the engine, some are identified but not presented as feedback to
the user, and some are identified and presented as feedback to the user.

One might consider not including the rules in the first engine at all, in which case
the system would not need the first filter. Unfortunately, that is equivalent to
running speech recognition designed for native speakers on non-native speech, and this setup
typically does not achieve the desired performance. When the set of rules and/or models
is enlarged, some mistakes that teachers do not consider critical will not be reported
as errors at the analysis phase. Then, when an error is identified, the application in
communication mode may still not indicate the error to the user, in accordance with the
criteria that were set up.
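
The two filters described above can be illustrated with a minimal sketch. It is not
taken from the described system: the policy table, the phoneme labels, and the function
names (action_for, active_rules, feedback) are hypothetical and chosen only to show how
one engine can serve both modes.

```python
from enum import Enum


class Action(Enum):
    SKIP = "skip"      # first filter: the engine does not look for this error
    HIDE = "hide"      # second filter: identified, but not shown to the user
    REPORT = "report"  # identified and presented as feedback


# Hypothetical communication-mode policy; in practice the criteria would be
# supplied by teachers. Phonemes not listed here are reported.
COMMUNICATION_POLICY = {
    "IH": Action.REPORT,
    "TH": Action.HIDE,
    "ER": Action.SKIP,
}


def action_for(phoneme, mode):
    """Return how an error on `phoneme` is treated in the given mode."""
    if mode == "pronunciation":
        return Action.REPORT  # pronunciation mode targets every error
    return COMMUNICATION_POLICY.get(phoneme, Action.REPORT)


def active_rules(all_rules, mode):
    """First filter: the rules the engine evaluates at all."""
    return [p for p in all_rules if action_for(p, mode) is not Action.SKIP]


def feedback(detected, mode):
    """Second filter: the identified errors that are presented to the user."""
    return [p for p in detected if action_for(p, mode) is Action.REPORT]
```

In this sketch, calling active_rules before analysis and feedback after analysis yields
the three outcomes described for communication mode, while pronunciation mode passes
every error through both filters.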

Other features and advantages of the present invention should be apparent from 
the following description of the preferred embodiment, which illustrates, by way of 
example, the principles of the invention. 

Brief Description of Drawings 

Figure 1 shows a user making use of a language training system constructed 
according to the present invention. 

Figure 2 shows a display screen of the Figure 1 system prompting a user to speak 
several words. 

Figure 3 shows a display screen of the Figure 1 system, after all words were
recorded by the user, offering analysis of user pronunciation errors (an Analyze
button is added at the bottom center of the screen).

Figure 4 shows the display screen of the Figure 1 system providing pronunciation
error analysis of the words recorded as in Fig. 3.

Figure 5 shows the display screen of the Figure 1 system prompting a user to 
speak several expressions. 

Figure 6 shows the display screen of the Figure 1 system providing pronunciation 
error analysis of the expressions recorded as in Fig. 5. 

Figure 7 shows a display screen of an exercise training a user with the proper 
language required for dialogue. 

Figure 8 shows a display screen of Mini Dialogue after the user has recorded all
the responses and they were analyzed in accordance with communication criteria, thus
providing an overall speech grade and pronunciation help.

Figure 9 shows a display screen of a Dialogue conducted between the user and
the system/PC. The user selects whether to play the role of Speaker A or B. He/she is then
prompted to record that speaker's role in response to the PC "speaking" the other speaker's role.

Figure 10 shows a display screen of the Figure 1 system providing the
communication performance result and offering pronunciation error analysis of the
dialogue recorded according to the application described in Fig. 9.

Detailed Description 

Figure 1 is a representation of a user 102 using the Spoken Language System
constructed according to the present invention. The system shown in Figure 1 includes a
PC 106 with a sound card, speakers or headset 122, and a microphone 126. The PC
plays multiple roles in the system. Its CPU runs the application, its display 120 presents
the application screens, and its audio interface plays the application prompts through the
speakers or headset 122. In addition, the PC audio input is used to record (via the
microphone 126) the utterances produced by the user. These utterances are recorded to the PC
memory to be later played back to the user and/or analyzed according to pronunciation or
communication analysis criteria.

Figure 2 shows a visual display of the screen 120 that prompts or triggers the user 
to speak multiple words. In the current application software, the user first produces 
(speaks) all the words. Each word is displayed on the screen and the user can listen to it 
being spoken by clicking on the play button located on the left side of each word. The 
user clicks on the microphone button and then records the user's pronunciation of the 
word. During recording, a record level indicator is displayed in the recorded word row. If 
recording is rejected because the speech was too soft, too loud, etc., an error message is
immediately displayed on the pronounced word row. If the word was properly recorded 
(regardless of pronunciation errors), a signal symbol is presented on the display and a 
user play button is added on the right side of the microphone display icon. The Student 
Play button enables the user to play his/her recorded word. Each word translation is also 
displayed on the right side of the word row. The user has to finish recording all the 
prompted words in order to continue with the application. The words can be recorded in 
any order as long as, at the end, all the prompted words are recorded. After listening to
his/her recordings, the user may also elect to re-record a certain word; only
the last recording of each word is taken into account in the following parts of the
application.
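
The recording rules described above (a recording may be rejected for level, words may be
recorded in any order, only the last recording of a word counts, and analysis requires
every prompted word) can be sketched as follows. This is an illustration under stated
assumptions, not the system's implementation: the RecordingSession class, the peak-level
test, and the threshold values are hypothetical, and samples are assumed to be floats
in the range -1.0 to 1.0.

```python
class RecordingSession:
    """Tracks the latest accepted recording for each prompted word."""

    def __init__(self, prompts, min_level=0.05, max_level=0.95):
        self.prompts = list(prompts)
        self.min_level = min_level  # hypothetical "too soft" threshold
        self.max_level = max_level  # hypothetical "too loud" threshold
        self.recordings = {}        # word -> samples of the last accepted take

    def record(self, word, samples):
        """Store a take, or return a rejection message if the level is off."""
        if word not in self.prompts:
            raise KeyError(word)
        peak = max((abs(s) for s in samples), default=0.0)
        if peak < self.min_level:
            return "too soft"
        if peak > self.max_level:
            return "too loud"
        # A later accepted take replaces any earlier one for the same word.
        self.recordings[word] = list(samples)
        return "ok"

    def ready_for_analysis(self):
        """True once every prompted word has an accepted recording."""
        return all(word in self.recordings for word in self.prompts)
```

A rejected take leaves any earlier accepted recording of the same word in place, which
is one possible reading of the behavior described; the text does not specify this case.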

Figure 3 shows a visual display of the screen described in Figure 2 above, after all
words were successfully recorded. Some words may have been recorded several times,
but there is no external indication of the number of times each word was recorded. Only
the last recording will be analyzed in the following part of the application software. After
all words are recorded, a new button is presented at the bottom center of the display,
shown in Figure 3 as "Analyze Results". This button enables the user to run the
application software analysis program and analyze the user's recordings of the presented
words to find pronunciation errors.

Figure 4 shows a visual display of the feedback from the pronunciation error analysis
performed on the words presented in Figure 3 above, after the user has clicked on the
Analyze Results display button. Up to five pronunciation errors are displayed in the
pronunciation feedback window. Each pronunciation error is identified by English letters
(e.g. IH) symbolizing the phoneme that was not pronounced properly, and/or other text
that gives the user an indication of the error phoneme (e.g. sheep). This kind of
simplified text may be required, since most users of such systems are not familiar with the
phonetic alphabet. When one of these error phoneme buttons is clicked, the system
displays all words where the error was found, and indicates the exact location of the error
within the word. This is done by displaying the "spelling" of the word, and adding a red
triangle below the part of the text that represents the phoneme that was identified as
pronounced incorrectly. The user is also offered additional training and practice for the
specific sound that was mispronounced. By clicking on the "Train Me" button shown in
Figure 4, which appears below the mispronounced phoneme, the user is taken to
another part of the application that teaches the student how to properly
produce the sound and provides practice in doing so.
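
The feedback window described above groups the detected errors by phoneme, shows at most
five of them, and marks where each error falls within the spelling of a word. A minimal
sketch of that grouping follows. The input tuple layout, the function names, and the
choice to rank phonemes by how many words they affect are assumptions; the text does not
state how the five displayed errors are selected.

```python
from collections import defaultdict


def summarize_errors(detections, limit=5):
    """Group detections by phoneme and keep at most `limit` phonemes.

    `detections` is an iterable of (word, phoneme, start, end) tuples, where
    start and end are letter offsets of the mispronounced part of the word.
    """
    by_phoneme = defaultdict(list)
    for word, phoneme, start, end in detections:
        by_phoneme[phoneme].append((word, start, end))
    # Assumed ranking: most frequent phoneme first; ties keep first-seen order.
    ranked = sorted(by_phoneme.items(), key=lambda item: -len(item[1]))
    return ranked[:limit]


def mark(word, start, end):
    """Return the word with a marker line under the mispronounced letters."""
    return word + "\n" + " " * start + "^" * (end - start)
```

Here mark stands in for the red triangle drawn under the spelling; a graphical
application would use the same letter offsets to position the marker.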

Figure 5 shows a visual display of a screen similar to that of Figure 2, which prompts
the user to speak. In Figure 2, the recorded utterances were words, whereas in Figure 5
they are expressions composed of multiple words. The application is also similar to the
one described in Figure 2 above, in that it encourages the user to record all expressions before
offering pronunciation analysis.

Figure 6 shows the computer system display screen providing feedback on the
user's production of the inputted expressions. As in Figure 4 above, where analysis results
are displayed for words, the Figure 6 screen provides feedback on the analysis results for
the recorded expressions. Up to five phonemes that were mispronounced are displayed.
When a user selects any of them, the application presents the expressions and the exact
location within each of the expressions where this error was identified. The user may also
click on the newly displayed "Train Me" button, which offers additional teaching,
training, and exercises on the proper production of the mispronounced sound (phoneme).

Figure 7 shows a visual display of the system teaching the user the correct
language required to conduct a dialogue. There are multiple questions and multiple
answers for each of them. The user is requested to select the appropriate answer to each
statement in the question. This exercise trains the user in dialogue language prior to the
oral dialogue that follows this part of the application. A score is given for the student's
overall performance in this exercise.

Figure 8 shows a display screen of the computer system that gives the user practice in
dialogues. This part of the application software is called "Mini Dialogue" since the
system/PC represents one of two speakers, where the user is the other one. These are
short dialogues, one phrase for each speaker. The system prompts the user and he/she is
requested to orally complete the other speaker's role in the dialogue. After all recordings
have been completed, the system analyzes the user utterances and provides a grade for the
user's overall speech performance as well as pronunciation help. The Speech
Recognition engine used in this application is the communication one, where only a
subset of the pronunciation rules is active and the system places more emphasis on
communication skills than on pronunciation skills.

Figure 9 shows a display screen of the computer system that practices a more
complete dialogue (compared to the Mini Dialogues presented in Figure 8 above). In this
case the user selects to be either speaker A or speaker B and then orally interacts with the
PC, which plays the other speaker's role. The goal of the exercise is to practice and improve
the user's fluency in speaking the language while conducting a dialogue. Unless the user makes a
"significant" mistake, the system will not comment and will let the user record his/her part of
the dialogue without interference.

Figure 10 shows a display screen of the computer system that practices dialogues
as presented in Figure 9 above, after all user utterances have been successfully recorded and
analyzed for fluency, intelligibility, and pronunciation errors. The speech score is
presented immediately; to receive the pronunciation feedback, the user
clicks on the Pronunciation Help button ("See your errors"), and the
pronunciation errors are then presented (in a similar way as for the words and expressions).
This part of the application uses the Communication Engine, which is the same Speech
Recognition Engine operating with a subset of the pronunciation error rules. It thus
allows (skips) certain pronunciation errors that do not affect the intelligibility of the
utterance, and indicates others that an average teacher in a classroom would find unacceptable.